Selection of noise level in strategy adoption for spatial social dilemmas 
We studied spatial Prisoner's Dilemma and Stag Hunt games where both the strategy distribution and the
players' individual noise level could evolve to reach higher individual payoff. Players are located on the sites of 
different two-dimensional lattices and gain their payoff from games with their neighbors by choosing uncondi- 
tional cooperation or defection. The way of strategy adoption can be characterized by a single K (temperature- 
like) parameter describing how strongly adoptions depend on the payoff-difference. If we start the system from 
a random strategy distribution with many different player specific K parameters, the simultaneous evolution of 
strategies and K parameters drives the system to a final stationary state where only one K value remains. In the 
coexistence phase of cooperator and defector strategies the surviving K parameter is in good agreement with 
the noise level that ensures the highest cooperation level if a uniform K is assumed for all players. In this paper
we give a thorough overview of the properties of this evolutionary process.

PACS numbers: 89.65.-s, 89.75.Fb, 87.23.Kg 



I. INTRODUCTION 

Evolutionary game theory has recently attracted great interest
from many scientific fields [1-6]. Physicists,
biologists, economists and many other scientists have found it 
challenging to study multi-agent evolutionary systems. 

The Prisoner's Dilemma (PD) game is an excellent toy model
to describe the sharpest conflict situation between
individual and common interest as it contains all the basic fea- 
tures of such interactions. The original PD game is a two- 
person one-shot game where players can choose between two 
types of behavior, to cooperate or to defect. They earn payoffs 
according to the simultaneous decisions of both participants. 
A cooperator gets the 'sucker's payoff' (S) against a defector,
while successful defection yields the temptation to defect (T)
for the defector. Mutual cooperation is rewarded by R for each
player, while two defectors (receiving payoff P) punish each
other with their defective behavior. The game is a classical PD
game if the payoffs satisfy the relation T > R > P > S.
This inequality causes two selfish (rational) players to defect
independently of the other player's decision; thus they get the
second worst payoff (P) instead of the second best one (R) for mutual
cooperation, which results in a dilemma. If R > T, the
social dilemma is weakened because unilateral deviation
from mutual cooperation is not beneficial [7]. In the latter,
so-called Stag Hunt (SH) game, acting identically to the partner
yields the highest payoff, similarly to the
Coordination game.
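The payoff orderings above can be turned into a small classifier. The following is only an illustrative sketch: the function name and the labels are assumptions of this example, and the "weak PD" and "SH" branches follow the convention used later in the paper, where P = S is allowed.

```python
def classify_game(T, R, P, S):
    """Classify a symmetric two-strategy game from its four payoffs.

    Labels are illustrative: "PD" needs T > R > P > S, "weak PD" relaxes
    the last inequality to P == S, and "SH" is used when R > T > P >= S.
    """
    if T > R > P > S:
        return "PD"
    if T > R > P == S:
        return "weak PD"
    if R > T > P >= S:
        return "SH"
    return "other"


# Hypothetical payoff sets, chosen only to exercise each branch.
print(classify_game(5, 3, 1, 0))
print(classify_game(1.05, 1, 0, 0))
print(classify_game(0.95, 1, 0, 0))
```

With the one-parameter payoffs used in this paper, the single value b = T decides between the last two cases.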

The introduction of spatiality [8] revealed fundamentally
new solutions of the game, which cannot be detected in the 
well-mixed situation. In spatial evolutionary models, players 
are located on the sites of a network where the links of the 
network define their possible connections. Players gain their 
accumulated payoff from games with their immediate neigh- 
bors and sometimes - according to the evolutionary dynamics 
- they can adopt the strategy of a neighbor. Usually the strat- 
egy adoption probability depends on the payoff difference and 
in accordance with the Darwinian principle, individuals with 
higher payoff (fitness) have a greater chance to supersede the 
less successful ones. Due to the spatial scenario, cooperation
can survive in the system even if only the simplest strategies
are allowed, i.e., unconditional cooperation and defection.
Here, cooperators form clusters and support each other
through the short-range local interactions while defectors punish
each other with mutual defection. The invasion processes
along the borders of the clusters depend on the irregularity of
the interface, the underlying network(s), and the evolutionary
dynamical rule.

The PD game was studied on many types of networks (different
lattices [9, 10, 11], scale-free graphs [12], small-world
networks [13], etc.), investigating the effect of basic topological
features on the measure of cooperation. To reduce the
number of degrees of freedom, evolving networks [14] were
examined, too. In these models, players have the opportunity 
to change their neighbors during the evolutionary process to 
attempt to increase their income, i.e., the strategy distribution 
and the connectivity graph co-evolve. As a result of this evo- 
lution, highly cooperative communities could be established. 

The simultaneous change of strategy and a player-specific
parameter has been extended to other quantities, too. For example,
Fort [15] studied models where the elements of the payoff
matrix were inherited in parallel with the strategy adoption.
The evolution of the strategy pass capability [16] or the interaction
range of players [17] has also been discussed. Moyano and
Sanchez [18] studied the competition between several pairs of
dynamical rules controlling the strategy adoption in the system.
The latter results have motivated us to study the evolution
of a noise-related parameter [19], which frequently characterizes
the uncertainty in the adoption process IU1I1 . Many things
can cause this uncertainty, such as temporal or spatial fluctuations
in the payoff values, errors in decision or in perception,
emotions, individual point of view (free will), etc. The role of
the noise parameter on different underlying graphs was studied
thoroughly in several works [20, 21, 22]. It turned out
that on structures that can be fully traversed by stepping
only on overlapping triangles (i.e., on structures with triangle
percolation), cooperation could be maintained in the widest
parameter range when the measure of noise was minimal [20].
On structures without triangle percolation, by contrast, the optimal
measure of noise for cooperation is shifted to positive values.
In this article, we study the simultaneous evolution of the
noise parameter (the quantity characterizing the adoption
rule) and the strategy distribution on different, relevant structures.
The square lattice and the kagome lattice will be analyzed
as good examples of graphs without and with triangle percolation,
respectively. In our model, players can apply different noise parameters,
and during the evolutionary process they can adopt not
only the strategy of a neighboring player but also that player's
individual noise parameter. In other words, a player can learn
not only a more successful strategy but also the way a successful
player reacts to payoff differences, i.e., his adoption
rule. The two elementary evolutionary steps are independent.
We will show that as a result of this dynamics only one noise
level remains in the final state, even if several strategies can
coexist (in the coexistence region), and that the remaining noise level
is close to the one providing the highest cooperation for systems
with a homogeneous noise distribution. In a previous letter
[19] we considered only the case of the weak PD. Now that
investigation is extended beyond the limits of the weak PD
to SH games by revealing the connection
between the cooperation density surface and the location of the fixed noise
level. Finally, we briefly discuss what happens if not only the
strategies and the way of strategy adoption but also the payoff
matrix is allowed to be adopted through the same imitation
mechanism. We used Monte Carlo (MC) simulations and an
extended version of the dynamical mean-field approximation
(detailed here) to perform the investigations.



II. THE MODEL 

In our model, players are located on the sites x of a square
lattice (consisting of L x L sites with periodic boundary conditions)
or a kagome lattice (3 x L x L sites). These interaction graphs
serve as representatives of the two characteristic types
of two-dimensional lattices, i.e., lattices without and with
triangle percolation, respectively. Players can follow one of the two simplest
strategies, that is, unconditional cooperation (s_x = C)
or unconditional defection (s_x = D). They gain their cumulated
payoffs from one-shot PD games with their four nearest
neighbors. To reduce the number of parameters we use
a rescaled payoff matrix suggested by Nowak and May
[23], where the reward of mutual cooperation is R = 1 while
mutual defection yields P = 0 income. A cooperator gains
S = 0 payoff when facing a defector, while a successful defector
gets the temptation to defect (T = b). We investigate
an extended parameter space 0 < b < 2 to explore the SH
region, too, while S = 0 corresponds to the so-called weak PD for
1 < b < 2.
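As a rough illustration of how the cumulated payoff is obtained on the periodic square lattice, the sketch below reads the rescaled payoff matrix as R = 1, P = S = 0, T = b. The function names and the list-of-lists lattice representation are assumptions of this example, not part of the original model description.

```python
def pair_payoff(s_self, s_other, b):
    """Payoff of s_self against s_other, assuming R = 1, P = S = 0, T = b."""
    if s_other == "D":
        return 0.0          # both S and P are zero in the rescaled matrix
    return 1.0 if s_self == "C" else b


def neighbors(i, j, L):
    """Four nearest neighbors on an L x L lattice with periodic boundaries."""
    return [((i + 1) % L, j), ((i - 1) % L, j),
            (i, (j + 1) % L), (i, (j - 1) % L)]


def accumulated_payoff(strategy, i, j, b):
    """Sum of the one-shot game payoffs of site (i, j) with its neighbors."""
    L = len(strategy)
    return sum(pair_payoff(strategy[i][j], strategy[u][v], b)
               for u, v in neighbors(i, j, L))


# In a homogeneous cooperator state every player collects 4 * R.
all_c = [["C"] * 4 for _ in range(4)]
print(accumulated_payoff(all_c, 0, 0, b=1.05))
```

The kagome lattice would need a different `neighbors` function, but each site there also has four neighbors, so the rest of the sketch would carry over.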

For the MC simulations, we use a random initial strategy distribution
where the C and D strategies are present with the
same frequency. Besides the strategy, each player possesses
another parameter describing his willingness to make a rational
decision, i.e., an individual adoption rule (K_x). This parameter
can be interpreted as a personal noise parameter as it contains
the possible uncertainty factors in the strategy adoption.
Such factors can emerge from the fluctuation of payoff parameters,
changes in the environment, errors in decision, individual
freedom to risk a given amount of income, etc., depending on
the situation being modelled. Initially, we
assign an adoption parameter K_x to every player from a finite
set, that is, K_x ∈ {K_1, K_2, ..., K_n}, where n denotes the
number of different K values.

During the evolutionary process, we choose two neighboring
players (x and y) randomly, and we calculate their accumulated
payoffs (P_x and P_y) gained from PD games with
their neighbors. In an elementary evolutionary step, player x
can adopt the strategy s_y and/or the noise value K_y of player
y with the probability

    W = \frac{1}{1 + \exp[(P_x - P_y)/K_x]} .    (1)

The possible adoption of the strategy and the noise value happens
independently of each other, i.e., it is possible that only
one of them is adopted in an elementary step. As the Darwinian
principle dictates, for P_y - P_x ≫ K_x both the strategy
and the adoption rule of player y are very likely to be adopted.
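To make the role of K_x concrete, here is a minimal sketch of the adoption probability, reading it as the Fermi-type function W = 1/(1 + exp[(P_x - P_y)/K_x]). The function name and the overflow guard are assumptions of this example.

```python
import math


def adoption_probability(P_x, P_y, K_x):
    """W = 1 / (1 + exp[(P_x - P_y) / K_x]) for a player x imitating y."""
    z = (P_x - P_y) / K_x
    if z > 700.0:
        # exp(z) would overflow a double; W is numerically zero here.
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


# Equal payoffs give a coin flip regardless of the noise level.
print(adoption_probability(2.0, 2.0, 0.1))
# A much more successful neighbor is imitated almost surely at low noise.
print(adoption_probability(0.0, 4.0, 0.1))
```

Small K_x makes the decision nearly deterministic in the sign of P_y - P_x, while large K_x pushes W towards 1/2 for any payoff difference.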

According to the proposed protocol, the adoption rule of
player y (K_y) can still be adopted even if the strategies are the
same (s_x = s_y). As a consequence, the adoptions of strategies
and rules can terminate independently, with the system arriving at one of the
absorbing states formed by identical strategies and/or uniform
adoption rules. In the absence of mutation the system cannot
leave these states.

In general, the existence of many absorbing states can cause
technical difficulties in the interpretation of numerical results
obtained on small systems. For small sizes the system evolves
quickly into one of the absorbing states despite the existence
("long-time stability") of a mixed state in the limit L → ∞.
These difficulties can be avoided by using sufficiently large
system sizes, at the cost of longer simulations. We
will show, however, that the significantly faster simulations
on small systems can also be utilized to extract accurate quantities
characterizing the behavior of the present evolutionary
games.

As we stressed, we used the same adoption probability, given by Eq. (1), for
both evolving quantities. However, the
time scales of their evolution can be separated, namely the
strategy or the noise parameter may evolve faster or slower.
Such a time scale separation of coevolving quantities was already
studied in several earlier works [14, 24]. An interesting
observation was the shift of effective payoff elements if the
links of players, as an evolving quantity, change much faster
than their strategies. In our case the time scale separation can be
easily achieved by adding a multiplicative prefactor 0 < Q < 1 to
the transition rate of one evolving quantity, resulting in a slower evolution
compared to the other one. Simulations show that the fixed
K* noise value is robust: the system arrives at the same
state independently of the time scale separation of the evolving
quantities. The fixation time, however, depends strongly on
the applied Q parameter. Accordingly, the results presented
in the next sections correspond to the Q = 1 case.

Starting the system from a random distribution of strategies
and noise parameters, the iteration of the above elementary
steps governs an evolutionary process that can be quantified
by recording the fraction p_C(t) of cooperators as well as
the portion ν_{K_i}(t) of players following the dynamical rule K_i.
In one time unit (called MC step, MCS for short) each player
has on average one chance to adopt the strategy and/or noise
value of a neighbor.
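The elementary steps and the MCS time unit can be sketched as follows for the square lattice. This is only a schematic illustration under stated assumptions: payoffs are read as R = 1, P = S = 0, T = b, strategies are stored as 1 (C) and 0 (D), the two adoptions use two independent random draws with the same probability W, and all function names are hypothetical.

```python
import math
import random


def fermi(P_x, P_y, K_x):
    """Adoption probability W = 1 / (1 + exp[(P_x - P_y) / K_x])."""
    z = (P_x - P_y) / K_x
    return 0.0 if z > 700.0 else 1.0 / (1.0 + math.exp(z))


def payoff(s, i, j, b):
    """Accumulated payoff with R = 1, P = S = 0, T = b (1 = C, 0 = D)."""
    L = len(s)
    n_coop = (s[(i + 1) % L][j] + s[(i - 1) % L][j]
              + s[i][(j + 1) % L] + s[i][(j - 1) % L])
    return (1.0 if s[i][j] == 1 else b) * n_coop


def mc_step(s, K, b, rng):
    """One MCS: L * L elementary steps, so each site is chosen once on average."""
    L = len(s)
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        di, dj = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
        u, v = (i + di) % L, (j + dj) % L
        W = fermi(payoff(s, i, j, b), payoff(s, u, v, b), K[i][j])
        # Strategy and noise value are adopted independently of each other.
        new_s = s[u][v] if rng.random() < W else s[i][j]
        new_K = K[u][v] if rng.random() < W else K[i][j]
        s[i][j], K[i][j] = new_s, new_K


def cooperator_fraction(s):
    return sum(map(sum, s)) / (len(s) * len(s))


def rule_fractions(K, K_values):
    flat = [k for row in K for k in row]
    return {k: flat.count(k) / len(flat) for k in K_values}


rng = random.Random(1)
L, K_values = 16, (0.1, 0.4, 0.8)
s = [[rng.randrange(2) for _ in range(L)] for _ in range(L)]
K = [[rng.choice(K_values) for _ in range(L)] for _ in range(L)]
for t in range(10):
    mc_step(s, K, b=1.05, rng=rng)
print(cooperator_fraction(s), rule_fractions(K, K_values))
```

A time scale separation could be mimicked by multiplying W by a prefactor Q in only one of the two draws, but the sketch keeps Q = 1.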

Finally we mention that for a homogeneous dynamical rule
(K_x = K, ∀x) the present model is identical to a previously
studied spatial model [20], as discussed briefly below. Furthermore,
for a homogeneous strategy distribution [s_x = C (or D),
∀x] all the players receive the same payoff [P_x = 4 (or 0),
∀x] and the adoption probability (1) becomes uniform, i.e.,
W = 1/2, and the evolution of the K_x distribution becomes
identical to the process described by the n-state voter model.



III. SELECTION OF NOISE LEVEL ON SQUARE LATTICE



First we briefly outline the outcome if all players have the
same adoption parameter (K_x = K, ∀x), i.e., if only the strategy
distribution can evolve [20, 21, 22]. In this case, on lattices
as underlying graphs, we can usually distinguish three
regions when increasing b for a fixed K parameter. Only
cooperators remain in the final state after a transient time if
b < b_{c1}(K). On the contrary, for b_{c2}(K) < b, defectors
prevail. The C and D strategies coexist in the region
b_{c1}(K) < b < b_{c2}(K), where the concentration of cooperators
decreases continuously from 1 to 0 when increasing b
from b_{c1}(K) to b_{c2}(K). The continuous phase transitions at
the two threshold values belong to the directed percolation
universality class [11, 26]. For uniform K values the major
features of this system can be summarized in a b-K phase
diagram [20], where the curves b_{c1}(K) and b_{c2}(K) denote the
phase boundaries separating the homogeneous D, the coexisting
C + D, and the homogeneous C phases, as is partly
illustrated in the inset of Fig. 1 or in Fig. 6.

The systematic analysis of the proposed evolutionary process
has confirmed the existence of a distinguished noise
level K*(b) within the (C + D) coexistence region for
any fixed b (on the square lattice the corresponding region is
b_min = 0.940(3) < b < b_max = 1.078(1), where the
borders are the minimum and maximum values of the b_{c1}(K)
and b_{c2}(K) functions, respectively). Figure 1 shows what happens if
K* ∈ {K_1, K_2, ..., K_n} for n = 5. In this case the final
state with all players using the same K* rule is reached after
about 1000 MCS, independently of L if L > 500.

The speed of relaxation towards the homogeneous K_x =
K* state is strongly influenced by the initial set of possible K_i
values. The lower panel of Fig. 1 illustrates a situation where
the four additional K_i values are very close to the distinguished
K* value. The evolution of adoption rules is
still straightforward, but the considerable increase in the relaxation
time may be related to the smaller difference in the driving force
favoring K_x = K* at the expense of the others. A similar slowing
down can be observed if n is increased while the maximal
value of K_i is limited. In any case, the upper limit of the
range of possible K_i values [assuming that max(K_i) > K*]
does not influence the final results because the strategy adoption
using the largest K_i value dies out first (see Fig. 1).

FIG. 1: (color online) The time dependence of the fraction of the initial
K_i values (ν_{K_i}) demonstrates the selection of the distinguished adoption
rule K* (indicated by an arrow in the inset) if four other rules (K_i)
are permitted in the initial state on the square lattice at b = 1.05
(upper panel). The inset, as part of the b-K phase diagram, shows
the initial K values chosen from the coexistence (C + D) region and the absorbing
(D) region as well. The lower panel illustrates the same evolution
if the additional four dynamical rules are closer to K*, namely,
K_i = K* + 0.05(i - 2) for i = 1, ..., n with n = 5. For both cases the
linear system size was L = 1000.

The above results raise the question: What happens if the
initial set of dynamical rules is outside the range of the (C + D)
coexistence phase? (The same situation arises if b < b_min or b > b_max.)
In this case the cooperators (or defectors) die out soon and the
adoption of noise levels becomes random, as described by
the voter model, which predicts a behavior dependent on the spatial
dimension d. Namely, algebraically growing domains of the
same K_i values occur on the one-dimensional lattice (d = 1),
the typical size of homogeneous domains increases with ln(t)
if d = 2, and the system remains inhomogeneous for d ≥ 3
[25, 27]. According to this theoretical prediction a very
slow (logarithmic) coarsening is expected. This case is
demonstrated in Fig. 2, where the initial K_i values are in the
all-D phase. For the given simulation the cooperators
died out at t = t_ext = 2600 MCS, establishing the conditions of the
voter model. Afterwards a coarsening without surface tension
starts, reflected in huge fluctuations of the ν_{K_i}(t) functions. The
semi-log scale demonstrates clearly that the absorbing state is
reached after a long coarsening time. (To guide the eye we
have plotted a log t function, too.) In the final state all the
players use the same noise level. In principle, any of the persistent
rules (K_i) can invade the whole finite system with a
probability ν_{K_i}(t_ext) due to the fluctuations.
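The last statement, that a rule takes over a finite system with a probability equal to its fraction at the moment the voter-like regime starts, can be checked with a toy simulation. The sketch below is an assumption-laden illustration: it uses a very small periodic square lattice, only two labels, a uniform adoption probability W = 1/2, and hypothetical function names; it is not the simulation setup of the paper.

```python
import random


def voter_fixation(L, labels, rng):
    """Run W = 1/2 voter dynamics on an L x L periodic lattice until one
    label remains, and return that label."""
    state = [[labels[i * L + j] for j in range(L)] for i in range(L)]
    counts = {}
    for row in state:
        for k in row:
            counts[k] = counts.get(k, 0) + 1
    while len(counts) > 1:
        i, j = rng.randrange(L), rng.randrange(L)
        di, dj = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
        u, v = (i + di) % L, (j + dj) % L
        # Site (i, j) copies its neighbor's label with probability 1/2.
        if state[i][j] != state[u][v] and rng.random() < 0.5:
            old, new = state[i][j], state[u][v]
            state[i][j] = new
            counts[new] += 1
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
    return next(iter(counts))


def fixation_frequency(L, n_first, runs, seed=0):
    """Fraction of runs in which label 0, initially held by n_first
    randomly placed sites, takes over the whole lattice."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(runs):
        labels = [0] * n_first + [1] * (L * L - n_first)
        rng.shuffle(labels)
        if voter_fixation(L, labels, rng) == 0:
            wins += 1
    return wins / runs


# Label 0 starts with a fraction 4/16, so the estimate should be near 0.25.
print(fixation_frequency(L=4, n_first=4, runs=300))
```

The estimate carries a statistical error of a few percent at this number of runs, so only rough agreement with the initial fraction should be expected.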







FIG. 2: (color online) Time dependence of the fraction of the three
K_i values on the square lattice at b = 1.05 and L = 1000. The inset
shows the position of the initial K_i values in the homogeneous
D region. The arrow points to the time t_ext when the C strategy
became extinct. The solid line shows a log t function illustrating the logarithmic
coarsening in the final period.

Several simulations were performed to study the cases
where the initial K_i values are positioned on both sides of the
range of (C + D) coexistence. The results indicated a behavior
qualitatively similar to that plotted in Fig. 2.

Besides the large fluctuations for t > t_ext, Fig. 2 shows a
smooth and deterministic variation in ν_{K_i}(t) within the period
where both C and D strategies exist. This feature has inspired
us to study the effect of strategy mutation on the Darwinian
selection of noise levels in the regions where only cooperators
(b < b_min) or defectors (b > b_max) would remain alive if the
evolution were controlled by imitation only. For this purpose,
in a few simulations, the above-mentioned evolutionary rule
was extended by allowing each player to change her strategy
(from C to D or conversely) with a small probability ε. Figure 3
shows that slightly above b_max the Darwinian selection
also favors a distinguished rule K*(b = 1.1) ≈ 0.25 (in the
presence of rare mutations) that can be considered as the analytical
continuation of K*(b) obtained within the coexistence
region. Evidently, the favored rule depends on both b and ε.
As the spreading of the distinguished noise level is catalyzed
by the mutants, the speed of this process vanishes with ε.

If we take b < b_min, within the SH region, the introduction
of small mutations results in a fixation value that can
also be considered as an analytical continuation of the fixed K*
values from the b > b_min interval. Namely, only small K
values survive and the fixation time increases drastically as b
decreases. Instead of a further analysis of the effect of mutations,
henceforth our attention will be focused on the behavior
of K*(b) in the (C + D) coexistence phase in the absence of
mutation.

FIG. 3: (color online) Time dependence of the fraction of ten K_i
values, distributed equidistantly from K_1 = 0.05 to K_10 = 0.5, on the
square lattice in the presence of rare mutations (ε = 0.02) at b = 1.1
and L = 1000. Labels indicate the closest K_i values to K*.

The above-mentioned features of the selection of dynamical
rules point to serious technical difficulties (related to the
long runs on large lattices) in the determination of K*(b) with
adequate accuracy. It turned out, however, that this quantity
can be evaluated more efficiently by repeating simulations
with only two possible K_i values on small systems.
In this case (n = 2) we choose a simple notation, namely
K_1 = K - ΔK/2 and K_2 = K + ΔK/2. For small sizes the
random initial state evolves rapidly into one of the absorbing
phases where all the players uniformly use the value K_1 or
K_2. Starting from different (random) initial states these simulations
are repeated many times (typically N_r = 2000) and the
preference for the second rule (K_2) is measured by the quantity
f = g(K_2) - g(K_1), where g(K_i) is the probability that
the evolution ends up in the absorbing state with all players using
the rule K_i. As the simulations run until reaching
one of the absorbing states, g(K_1) + g(K_2) = 1 and
f varies from -1 to +1. Similar quantities are used frequently
for the investigation of finite systems within the framework
of the Moran process [28].
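The preference measure can be estimated directly from the list of run outcomes. The following is a minimal sketch that reads the measure as f = g(K_2) - g(K_1); the function name, the encoding of outcomes as 1 or 2, and the binomial error estimate are assumptions added for illustration.

```python
import math


def preference(outcomes):
    """Estimate f = g(K_2) - g(K_1) from repeated two-rule simulations.

    `outcomes` lists, for each run, the index (1 or 2) of the rule that
    occupied the whole lattice in the absorbing state.
    """
    n = len(outcomes)
    g2 = sum(1 for o in outcomes if o == 2) / n
    g1 = 1.0 - g2
    f = g2 - g1
    # Since f = 2 * g2 - 1, its standard error is twice the binomial
    # standard error of g2.
    stderr = 2.0 * math.sqrt(g2 * g1 / n)
    return f, stderr


# Hypothetical outcome of N_r = 2000 runs: rule K_2 wins 1500 times.
f, err = preference([2] * 1500 + [1] * 500)
print(f, err)
```

With a few thousand runs the statistical error of f is of the order of a few hundredths, which indicates how finely the sign change of f can be resolved.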

Figure 4 demonstrates the results of these investigations
when varying the value of K and ΔK for a fixed system size
L and b. According to the MC data the position of K*, where the
sign of f changes, is independent of the value of ΔK within
the statistical error. f varies linearly with K in the close vicinity
of K = K* [more precisely, f ∝ (K* - K)]. Naturally,
a larger ΔK involves a larger difference in the "driving force"
favoring one of the two rules. This fact can be demonstrated by the data
collapse obtained when choosing a more suitable scale for the vertical
axis.

Figure 4 shows that f decreases smoothly if K is increased
for L = 40. Evidently, this transition becomes sharper if we
choose larger systems, as demonstrated in Fig. 5. According to
these data the system size influences only the absolute value
of f, and the function f(K) becomes zero at K = K* independently
of the system size if it is large enough. In the limit
L → ∞ a step-like transition (from 1 to -1 at K*) is expected.
To sum up, the position of K* can be determined by using
only one ΔK value at a small system size, which reduces the
necessary fixation time drastically.

FIG. 4: MC results for the measure f of preference between two
dynamical rules as a function of K on a square lattice of 40 x 40 sites
for b = 1.05 if ΔK = 0.005 (circles), ΔK = 0.01 (open boxes),
and ΔK = 0.02 (closed boxes). Inset: data collapse when scaling
f by 1/ΔK, suggesting that the identification of the preferred adoption
rule is independent of the magnitude of ΔK.

The positive (negative) value of f indicates the parameter
region where the system can evolve towards larger (smaller)
K_x values through weak mutations in K_x during the adoption
processes. Within this context K* in Fig. 4 can be considered
as an attractor.
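Because f is approximately linear in K near its zero, the attractor K* can be located by interpolating between the two grid points that bracket the sign change. The sketch below is one possible way to do this; the function name and the sample numbers are hypothetical and are not data from the paper.

```python
def estimate_k_star(k_values, f_values):
    """Return the K where f changes sign from positive to negative.

    Uses linear interpolation between the two bracketing grid points,
    which relies on f being approximately linear in K near K*.
    Returns None if no such sign change is found.
    """
    points = list(zip(k_values, f_values))
    for (k0, f0), (k1, f1) in zip(points, points[1:]):
        if f0 == 0:
            return k0
        if f0 > 0 > f1:
            return k0 + (k1 - k0) * f0 / (f0 - f1)
    if points and points[-1][1] == 0:
        return points[-1][0]
    return None


# Made-up measurements of f on a coarse K grid.
print(estimate_k_star([0.2, 0.3, 0.4], [0.5, 0.1, -0.3]))
```

Only crossings from positive to negative f are reported, since these correspond to attracting values of K; a crossing in the opposite direction would mark a border between basins of attraction instead.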




FIG. 5: The measure of preference between the dynamical rules
K_1 and K_2 versus K on a square lattice for different sizes [L = 160
(open circles), 80 (closed boxes), 40 (open boxes), and 20 (closed
circles)] if ΔK = 0.01 and b = 1.05.

The above features are utilized in the accurate determination
of K* for a given value of b. In our previous work [19]
these calculations were repeated to determine the distinguished
rule (K*) for different b values within the range of the weak Prisoner's
Dilemma. Now this analysis is extended to the whole
range of b (i.e., b_min < b < b_max) and the results are summarized
in Fig. 6.




FIG. 6: (color online) The surface of stationary cooperator density
(p) as a function of the b and K parameters when a homogeneous K distribution
is assumed on the square lattice. Closed circles and the thick (green)
line show the position of K*(b) favored by the Darwinian selection
of dynamical rules for fixed b. In the lower panel K*(b) and the
phase boundaries (separating the homogeneous D, the mixed C + D,
and the homogeneous C phases) are projected on the b-K plane.
For comparison, the positions of local maxima in the cooperator density
p for fixed b are denoted by the dotted (blue) line. The dashed
(black) line merely marks the border between the weak PD and SH
games at b = 1.

In order to get a deeper understanding of the fixation of
noise levels, the cooperator density profiles [p(K)] are illustrated
above the whole b-K plane (see Fig. 6). The plotted
p(K) curves are obtained for fixed b. In the simulations we
used the following system parameters: system sizes of 10^5 to
10^6 players, relaxation times of 5 x 10^5 to 10^6 MCS, and 10^5 to 2 x 10^5
further MCS for averaging to get the steady-state cooperator
density values. Larger system sizes and longer simulation
times were needed in the vicinity of the critical points. The
statistical errors of the plotted data are comparable with the
line thickness.

Each plotted p(K) curve (for fixed b) shows a local maximum
in the range of the weak PD (b > 1). The plot of Fig. 6
demonstrates clearly that the values of K*(b) are close to the
point where the cooperator density p has a local maximum for
the given b. Although the positions of the local maxima in the
average payoff and in p (for fixed b) are distinguishable, the difference
between these positions is small and comparable with
the size of the symbols, as discussed in [19].

Within the region of the SH game (b < 1) Fig. 6 shows a
significantly different behavior. First we emphasize that the
surface p(K, b) exhibits a valley within the coexistence region;
otherwise p = 1. More precisely, for fixed b the curve
p(K) differs from 1 within the coexistence region [namely, if
K_{c1}(b) < K < K_{c2}(b), assuming that b > b_min] and has a
local minimum. It turned out that within this region the Darwinian
selection of noise values also prefers a distinguished
rule that is positioned at the left edge of the "valley", that is,
K*(b) = K_{c1}(b). This means that if we study a system with
only two initial rules (K_1 and K_2, both within the coexistence
region), then the smaller one will spread over the whole system
in the final state.

For a uniform dynamical rule (K_x = K) on the square lattice
with nearest-neighbor interactions both b_{c1}(K) and b_{c2}(K)
(the phase boundaries in Fig. 6) go to 1 if K tends to either
0 or ∞, in such a way that one can observe an optimal noise
level for the cooperators in the PD region and another one for
the defectors in the SH game. In other words, the p(K) curves
have a local maximum (minimum) in the PD (SH) region. As
was shown by a previous study [21], the local maximum
of the cooperation level is related to the absence of overlapping
triangles (three-site cliques) in the interaction graph. The comparison
of the mentioned surface and the fixation values of K_i
for both games in Fig. 6 suggests that the possible evolution of the
strategy adoption rule will drive the system into a state that
closely ensures the optimal cooperation level independently
of the studied dilemma game. In the subsequent section we
will consider another type of connectivity structure exhibiting
a slightly different behavior.



IV. RESULTS ON KAGOME LATTICE 

In real human connectivity structures a relevant portion
of the neighbors of a player x are also connected to each
other (that is, the so-called clustering coefficient is sufficiently
large) [29]. The main effect of this topological feature can
be well investigated if the players are distributed on the sites
of the two-dimensional kagome lattice, where each player
also has four neighbors. The latter feature makes it possible to exclude
the additional impact of a changing coordination number.
The systematic investigation of the evolutionary PD game on
this connectivity structure has revealed a basically different
phase diagram in comparison with that observed on the square
lattice [20]. It turned out that the upper threshold value of
temptation [b_{c2}(K)] decreases monotonically from 3/2 to 1
if K is increased from 0 to ∞. Due to the different limit of the
b_{c2}(K) threshold values, one can also expect basically different
behavior in the Darwinian selection of noise levels for the
PD.

The monotonic K dependence of p in the low noise limit
is related to the presence of the overlapping triangles that support
the spreading of cooperators through the lattice [21].
In fact, the MC simulations indicate similar behavior for
many other regular connectivity structures (including two- and
three-dimensional lattices and some other regular networks)
where the overlapping triangles span the whole system. Thus
the kagome lattice can be considered as a sample representing
the latter type of interaction graphs.

Here it is worth mentioning that low K_i (or K) values
lead to diverging simulation times and cause technical difficulties
in the quantification of the low noise behavior. The
subsequent results were extracted from a set of simulations
where K_i > K_min = 0.002 is chosen for all i.

As we observed in the case of the square lattice topology, the fixation
of evolving K values is closely connected to the surface
of maximal cooperation obtained when using homogeneous dynamical
rules (K_x = K). Therefore, the same surface on the K-b plane
is determined for the kagome lattice, as plotted in Fig. 7.




FIG. 7: The same plot as Fig. 6 for the kagome lattice. The cooperation
level (upper panel) has minima in the SH region and maxima in the
PD area. The maximum at positive K decreases as b increases and
is replaced by a maximum at K = 0 if b exceeds a threshold value
(b_th = 1.182). Bullets and the thick green line mark the main attractor
while the positions of maximal cooperation level are also denoted by the
dotted blue line.

For the PD game (b > 1) the p(K) curves (for fixed b) vary
continuously from the nonzero value of p(K = 0) until reaching
the absorbing state (p = 0). We can distinguish different
types of behavior, although the accurate separation of the
corresponding parameter regions is prevented by the above-mentioned
technical difficulties. From the extrapolation of
the low noise behavior a monotonically decreasing p(K) can
be concluded if K is increased from zero to infinity at 3/2 >
b > 1.4. In the subsequent region, 1.4 > b > b_th = 1.182,
the curves p(K) possess only one local maximum, close to
K = 0. At b = b_th, with a sudden jump, there appears another
local maximum, while the local maximum close to
K = 0 still exists down to b = 1. The absolute maximum is the one
belonging to the larger K value. This behavior indicates that
triangle percolation adds a support for cooperation at lower
noise values and that the cooperator density profile can be derived
as the superposition of a plateau originating from the triangle
percolation effect and the normal one-peak profile of a lattice.
In the region of the SH game (b < 1) the p(K) curves resemble
those discussed in the previous section. As for
the square lattice, the function b_{c1}(K) goes to 1 if K tends to
zero or infinity.

FIG. 8: The same plot as Fig. 5 but using the kagome lattice topology
at b = 1.1 and ΔK = 0.01. The system sizes are 3 x L x L =
3 x 20 x 20 (closed circles), 3 x 40 x 40 (open boxes), and 3 x 80 x 80
(closed boxes). The plot demonstrates clearly the existence of two
attracting fixed points at K* = 0 and at K* = 0.212, where the
border between their basins of attraction is at K_sep = 0.038.

Our conjecture, based on the close relation of the fixed 
noise level and the optimal cooperation level of the homogeneous 
K system, is completely supported by the evolution of adoption 
rules on the kagome topology, too. More precisely, in the SH 
region the K_i values of the coexistence C + D phase drift to the 
minimal K value to reach K_c1(b), which ensures the maximal 
cooperation (p = 1). Technically, if we choose two initial 
K_i values from the coexistence region, then the lower K value 
will eventually spread in the whole population. At high values 
of b in the PD region the final value of the noise parameter is 
always the lowest among the initial set, signalling K* ≈ 0. This 
feature is related to the fact that the cooperation level always has 
a local maximum at K ≈ 0. Decreasing b, however, a bifurcation 
occurs at b = 1.185: besides the K* ≈ 0 fixed point a 
new attractor appears, located at a positive K value. The presence 
of two attractors is demonstrated in Fig. 8, where the 
measure of preference is plotted as a function of K. In this b 
region the initial K_i values always end up at K* ≈ 0 if they are 
below the K_sep(b) value. (The position of the separatrix, as the border 
of the basins of attraction, will be discussed in the next section.) If 
K_i > K_sep, the adoption rules converge to the above-mentioned 
K* ≠ 0 value. Naturally, if all K_i > K_c, meaning that all K_i 
values are from the absorbing D phase, similar behavior can 
be observed as shown in Fig. 2. 

An interesting situation occurs when the initial K_i values 
are distributed over the whole (0, K_c) interval. In this case, 
only the K* ≠ 0 fixed point survives, signalling that the latter 
is the stronger attractor. As expected, the position of the positive 
K* attractor is close to the K value where the maximal cooperation 
level is measured for the homogeneous K model. Summing 
up our observations for both representative topologies and for 
both dilemma games, we conclude that the system will spontaneously 
evolve to a state that is favorable for cooperation 
if the adjustment of noise (strategy adoption) is allowed. 
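
The coevolution of strategies and noise levels summarized above can be illustrated by a minimal Monte Carlo sketch. It is not the authors' simulation code: it uses a square lattice for brevity, and the payoff values (R = 1, P = 0, T = b, sucker's payoff S as a free argument), the Fermi-type adoption probability governed by the learner's own noise level, and all function names are assumptions made for illustration only.

```python
import math
import numpy as np

C, D = 0, 1  # strategy labels, also used as indices into the payoff matrix
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def site_payoff(strat, i, j, b, S):
    """Payoff of site (i, j) from games with its four nearest neighbours."""
    L = strat.shape[0]
    M = ((1.0, S), (b, 0.0))  # assumed payoffs: R = 1, S, T = b, P = 0
    s = strat[i, j]
    return sum(M[s][strat[(i + di) % L, (j + dj) % L]] for di, dj in NEIGHBOURS)

def mc_step(strat, noise, b, S, rng):
    """One elementary step: a random player may imitate a random neighbour."""
    L = strat.shape[0]
    i, j = rng.integers(L, size=2)
    di, dj = NEIGHBOURS[rng.integers(4)]
    k, l = (i + di) % L, (j + dj) % L
    if strat[i, j] == strat[k, l] and noise[i, j] == noise[k, l]:
        return  # nothing to adopt
    # Assumption: the learner's own K sets the noise of the imitation.
    x = (site_payoff(strat, i, j, b, S) - site_payoff(strat, k, l, b, S)) / noise[i, j]
    W = 0.0 if x > 700 else 1.0 / (1.0 + math.exp(x))
    if rng.random() < W:
        # Strategy and noise level are adopted together.
        strat[i, j] = strat[k, l]
        noise[i, j] = noise[k, l]

def run(L=20, b=1.05, S=0.0, K_values=(0.1, 0.3, 0.5), sweeps=50, seed=0):
    """Random initial strategies and K values, then sweeps * L * L elementary steps."""
    rng = np.random.default_rng(seed)
    strat = rng.integers(2, size=(L, L))
    noise = rng.choice(K_values, size=(L, L))
    for _ in range(sweeps * L * L):
        mc_step(strat, noise, b, S, rng)
    return strat, noise
```

Calling `run()` and inspecting `np.unique(noise, return_counts=True)` shows which of the initial K values are still present; since imitation only copies existing values, the set of surviving K values can only shrink over time.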



V. DYNAMICAL CLUSTER APPROXIMATIONS 

Besides the MC simulations, we performed dynamical cluster 
approximations [5] on the kagome lattice. The choice of the 
kagome lattice for this type of investigation was motivated by 
its simplicity. To highlight the difficulties in the application 
of this sophisticated technique, we first emphasize that neither 
the mean-field (one-site) nor the pair (two-site) approximations 
were capable of giving an adequate description of the 
homogeneous K model, particularly in the low noise limit. 
On the square lattice, a higher level of approximation (four-site 
approach) is needed to reproduce qualitatively the results 
of MC simulations. Furthermore, a more demanding nine-site 
level is necessary to reach an adequate accuracy. In contrast, 
the three-site (triangular) approximation can reproduce 
quantitatively well the results of MC simulations on the kagome 
lattice. 

Now we briefly survey this method for a simple case where 
each site x has only two states (s_x = C or D) and later we 
give the main details of the extension to the four-state systems 
that is necessary to describe our present model (for a more 
detailed description see the cited papers with further 
references therein). Within the framework of this approximation 
the system is characterized by all possible configuration 
probabilities p_3(s_α, s_β, s_γ) on a block of three neighboring sites 
forming a regular triangle on the kagome lattice (α, β and 
γ are site labels within the three-site block). In the present 
case the eight possible configurations can be given using 
only three parameters due to the symmetries and compatibility 
conditions. These configuration probabilities are determined 
by solving a set of differential equations expressing the 
derivative of p_3(s_α, s_β, s_γ) with respect to time (denoted as 
ṗ_3(s_α, s_β, s_γ)). The main difficulty in the application of this 
method comes from the fact that the contribution of the 
elementary processes (here strategy adoption between two 
neighboring sites) to the quantity ṗ_3(s_α, s_β, s_γ) depends on the 
configuration containing seven sites. 
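
One possible three-parameter description of the eight configuration probabilities is sketched below. The choice of parameters (the total probabilities q3, q2, q1 that a triangle contains exactly three, two, or one cooperators) and the function names are assumptions for illustration; the text does not state which three parameters were used. The sketch assumes that p_3 is invariant under any permutation of the three sites.

```python
import itertools
import numpy as np

C, D = 0, 1  # strategy labels used as array indices

def p3_from_classes(q3, q2, q1):
    """Build the symmetric 2x2x2 array p3 from three class probabilities.

    q_k is the total probability that a triangle holds exactly k cooperators;
    the all-D probability q0 follows from normalization.
    """
    q0 = 1.0 - q3 - q2 - q1
    if min(q3, q2, q1, q0) < 0.0:
        raise ValueError("class probabilities must be non-negative and sum to at most 1")
    # Each class with k = 2 or k = 1 cooperators contains three ordered configurations.
    per_config = {3: q3, 2: q2 / 3.0, 1: q1 / 3.0, 0: q0}
    p3 = np.empty((2, 2, 2))
    for cfg in itertools.product((C, D), repeat=3):
        p3[cfg] = per_config[cfg.count(C)]
    return p3

def cooperator_density(p3):
    """One-site probability of C obtained by summing over the other two sites."""
    return p3.sum(axis=(1, 2))[C]
```

With this parametrization the one-site marginal is the same whichever site of the triangle is kept, so the compatibility conditions hold by construction.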




FIG. 9: Strategy adoption from s_y to s_x (marked by an arrow) will 
modify the three-site configuration probabilities on triangles marked 
by grey color. All the other neighbors influencing the payoff differ- 
ence are denoted by grey circles. 

Figure 9 illustrates an elementary process with the neighborhood 
affecting the probability of strategy adoption from 
site y to x. This process decreases p_3(s_x, s_y, s_3) and 
simultaneously increases p_3(s_y, s_y, s_3) by the amount 



\[ \Delta p = W \, p_7(s_1, \ldots, s_5) \tag{2} \]



where W describes the payoff dependence defined by Eq. (1) and 
p_7(s_1, ..., s_5) denotes the seven-site configuration probability. 
For the present connectivity structure the latter quantity 
can be approximated by a Bayesian formula: 



\[ p_7(s_1, \ldots, s_5) = \frac{p_3(s_1, s_2, s_y) \, p_3(s_y, s_x, s_3) \, p_3(s_x, s_4, s_5)}{p_1(s_x) \, p_1(s_y)} \tag{3} \]

where the one-site configuration probabilities in the denominator 
can also be expressed by the three-site configuration 
probabilities as 

\[ p_1(s_x) = \sum_{s_y, s_3} p_3(s_x, s_y, s_3) \tag{4} \]
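
As a consistency check, the Bayesian formula (3) yields a properly normalized seven-site distribution. The short derivation below is a sketch that assumes p_3 is invariant under permutations of the three sites, so that summing over any two sites of a triangle gives the one-site probability of relation (4):

```latex
\begin{aligned}
\sum_{s_1,\ldots,s_5,\,s_x,\,s_y} p_7(s_1,\ldots,s_5)
 &= \sum_{s_x,\,s_y,\,s_3} p_3(s_y,s_x,s_3)\,
    \frac{\sum_{s_1,s_2} p_3(s_1,s_2,s_y)}{p_1(s_y)}\,
    \frac{\sum_{s_4,s_5} p_3(s_x,s_4,s_5)}{p_1(s_x)} \\
 &= \sum_{s_x,\,s_y,\,s_3} p_3(s_y,s_x,s_3) \;=\; 1 .
\end{aligned}
```

Both fractions reduce to 1 because each numerator is a one-site marginal of p_3, so the approximation introduces no spurious weight.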



Summarizing the contributions of all the possible elementary 
processes affecting the values of p_3(s_α, s_β, s_γ), one can 
derive a set of differential equations. We do not wish to 
display the huge formulae, which depend only on the three-site 
configuration probabilities due to the approximate formula (3). 
Instead, we emphasize that one can easily develop a computer 
algorithm to collect systematically all the contributions, 
and the resultant formulae can be used to find the stationary 
solution(s) numerically for any values of the parameters. Once 
the stationary solutions of the three-site configuration 
probabilities are known, the most relevant characteristics of the 
stationary state can be evaluated, for example, p = p_1(C). 
Using this approach the b-K phase diagram was reproduced 
qualitatively well in [20, 21]. 
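
The systematic collection of contributions can be sketched for the two-state, homogeneous K case as follows. This is an illustrative sketch, not the authors' program. It assumes a Fermi-type adoption probability W = 1/(1 + exp[(P_x - P_y)/K]), payoffs R = 1, P = 0, T = b with the sucker's payoff S as a free argument, and the site labelling of Fig. 9 in which sites 1 and 2 share a triangle with y, sites 4 and 5 share a triangle with x, and site 3 is the common neighbor of x and y. The derivative is computed up to a constant factor in the time scale, which does not affect the stationary solutions.

```python
import itertools
import math
import numpy as np

C, D = 0, 1  # strategy labels used as array indices

def payoff(s, neighbours, b, S):
    """Total payoff of strategy s against its four neighbours (assumed R=1, P=0, T=b)."""
    M = ((1.0, S), (b, 0.0))
    return sum(M[s][n] for n in neighbours)

def fermi(dP, K):
    """Assumed adoption probability 1 / (1 + exp(dP / K)), guarded against overflow."""
    x = dP / K
    return 0.0 if x > 700 else 1.0 / (1.0 + math.exp(x))

def dp3_dt(p3, b, K, S=0.0):
    """Time derivative of the symmetric 2x2x2 array p3, up to a constant time scale."""
    p1 = p3.sum(axis=(1, 2))  # one-site probabilities from the three-site ones
    dp = np.zeros_like(p3)
    for s1, s2, s3, s4, s5, sx, sy in itertools.product((C, D), repeat=7):
        if sx == sy or p1[sx] == 0.0 or p1[sy] == 0.0:
            continue  # no change of state, or a strategy that is absent
        # Bayesian estimate of the seven-site configuration probability
        p7 = p3[s1, s2, sy] * p3[sy, sx, s3] * p3[sx, s4, s5] / (p1[sx] * p1[sy])
        Px = payoff(sx, (sy, s3, s4, s5), b, S)
        Py = payoff(sy, (sx, s3, s1, s2), b, S)
        rate = fermi(Px - Py, K) * p7
        # x adopts s_y: both triangles containing x change their configuration
        dp[sx, sy, s3] -= rate
        dp[sy, sy, s3] += rate
        dp[sx, s4, s5] -= rate
        dp[sy, s4, s5] += rate
    # Average over site permutations so that the derivative stays symmetric
    return sum(dp.transpose(perm) for perm in itertools.permutations(range(3))) / 6.0

def relax(p3, b, K, S=0.0, dt=0.05, steps=2000):
    """Explicit Euler integration towards a stationary solution."""
    for _ in range(steps):
        p3 = p3 + dt * dp3_dt(p3, b, K, S)
    return p3
```

Starting from a random state, for example `np.full((2, 2, 2), 1 / 8)`, the cooperator density of the relaxed solution is `relax(...).sum(axis=(1, 2))[C]`. Explicit Euler steps are chosen only for brevity; any standard ODE integrator or root finder could be substituted.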

In the present work this method is extended by substituting 
(s_α, K_α) for s_α where s_α = C or D and K_α = K_1 = 
K - ΔK/2 or K_2 = K + ΔK/2. This extension does not 
influence the applicability of the method described above. 
Using only two initial adoption rules, it was possible to keep 
the number of feasible configurations low enough so that 
the numerical solution of the differential equation system was 
fairly fast. It turned out that this method is capable of 
reproducing all the relevant features characterizing the Darwinian 
selection of the dynamical rules. For the quantitative analysis 
we used small ΔK = 0.001 values to evaluate the position of 
the attractor (K*) and the separatrix for any value of b. 
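
To see why restricting the analysis to two adoption rules keeps the problem tractable, the sketch below enumerates the four site states (s, K) and counts the three-site configurations. The function names are hypothetical, and the count of symmetry classes assumes that configuration probabilities are invariant under any permutation of the three sites of a triangle.

```python
from itertools import combinations_with_replacement, product

def site_states(K, dK):
    """Four site states (strategy, noise level) with K_1 = K - dK/2 and K_2 = K + dK/2."""
    K1, K2 = K - dK / 2.0, K + dK / 2.0
    return [(s, k) for s in ("C", "D") for k in (K1, K2)]

def count_configs(states):
    """Return (ordered three-site configurations, permutation-symmetry classes)."""
    ordered = len(list(product(states, repeat=3)))
    classes = len(list(combinations_with_replacement(states, 3)))
    return ordered, classes

two_state = count_configs(["C", "D"])               # (8, 4)
four_state = count_configs(site_states(0.3, 0.001))  # (64, 20)
```

For two states the four classes minus one normalization condition leave three independent parameters, in line with the count quoted for the two-state case. Under the same assumption the four-state extension has 20 classes, that is 19 independent variables, and every further K value would enlarge the system considerably.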

The results are summarized and compared with the MC data 
in Fig. 10. The predictions of the generalized mean-field 
approximation for the PD region are in excellent agreement with 
the MC data. As both approaches indicate, there are two attractors 
in a restricted interval of b, where a separatrix marking the 
border of the basins of attraction tends to b = 1 if K decreases. It also 
means that the basin of the attractor K* ≈ 0 keeps getting wider 
with increasing b. When the separatrix reaches the other fixed 
point the latter disappears, resulting in a unique fixed point in the 
large b region. Turning to the SH side, the dynamical cluster 
approximation predicts an increasing K* fixed point as we leave 



FIG. 10: (color online) Positions of fixed points of adoption rules on 
the b-K plane as predicted by MC simulation (top) and three-site 
cluster approximation (bottom) for the kagome lattice. Bullets, connected 
by solid (green) line, show the fixed point given by MC simulations. 
The position of separatrix in the two-attractor region is denoted by 
dashed (green) line. The borders of phases are marked by dashed- 
dotted (blue) lines. In the lower panel solid (red) lines show the po- 
sition of attractors while separatrix is denoted by dashed (red) lines. 
Dashed-dotted (blue) lines mark the borders of phases. 



b = 1, which is again in nice agreement with the MC results. 
Further decreasing b, however, the approximation predicts a 
small two-attractor region and finally the K*(b) function 
coincides with the phase boundary where p becomes 1. 

Before evaluating the predictions of the present cluster 
mean-field approach, we should stress that the simpler 
(two-state) version of the three-site approximation, which is valid 
for the homogeneous K case, cannot correctly describe the 
functions p(K) and b_c1(K) in the low K limit. Namely, 
the border of the all-C phase tends to b = 0.75 instead of b = 1 
when K → 0. (According to this, defectors can survive the 
deterministic limit when b < 1.) Such a qualitative failure 
of the approximation might have been the consequence of the 
small number of independent variables. In other words, the 
restricted freedom prevents the approximation from finding the valid 
solution. At the same time, if we increase the number of 
independent variables by allowing different K_i values for players, 
the approximation is already capable of finding the relevant 
solution (at least in the vicinity of b = 1). Our argument is 
supported by the fact that the five-site approximation for the 
homogeneous K model can also describe the behavior b_c1 → 1 
in the K → 0 limit. The relevant increase in b_c1 can remove 
the artifact(s) that occurred at low b values. Summing up, the 
extension of the three-site approximation by allowing different K_i 
values was capable of indicating the correct results of the more 
sophisticated approximation based on a larger cluster of sites. 

VI. DISCUSSION AND OUTLOOK 

We studied evolutionary Prisoner's Dilemma and Stag Hunt 
games on two representative two-dimensional lattices. The 
underlying structures were the square lattice and the kagome 
lattice exemplifying spatial connectivity structures without or 
with triangle percolation. We analyzed the simultaneous 
evolution of strategy and a player-specific noise parameter used 
by individuals in the strategy imitation processes. It turned out 
that players prefer to use the same distinguished noise level, 
namely the same way of strategy adoption, and that the fixed points 
of the adoption parameter are systematically close to those 
parameter values which are the most favorable for cooperation 
in the PD games. This implies that the evolution of the 
adoption parameter drives the population to a state which assures 
substantial cooperation within the coexistence region. In the 
region of SH game the maximum average payoff is achieved 
at the peripheries of the coexistence region. It is shown that in 
the latter case the Darwinian selection favors the edge of co- 
existence region where the noise level (K) is lower. The pref- 
erence of the lower noise level can also be observed within 
the PD region because the distinguished rule K* was always 
smaller than that where the maximum average payoff (or co- 
operator density) occurs for homogeneous dynamical rules at 
fixed payoff. The above results raise many general questions 
about the main features of states (including many aspects of 
the model itself) favored by the Darwinian selection. Here we 
emphasize that the Darwinian selection seems to be more ef- 
ficient within the coexistence region where the simultaneous 
evolution of strategies accelerates the evolution of adoption 
rules, too. The latter observation is confirmed by simulations 
where the coexistence is maintained artificially by 
introducing rare mutations in the system. 

The above investigations required improved accuracy, 
particularly in the low noise limit where the relaxation 
time diverges. The systematic investigation of the cooperator 
density versus b (temptation to choose defection) and K 
has indicated different types of non-analytical behavior in 
the limit b → 1 and K → 0, as Figs. 6 and 7 show. At 
the same time we found qualitatively similar behavior on both 
structures in the region of the SH game. Although most of these 
features can be reproduced qualitatively well by the dynamical 
cluster methods on sufficiently large clusters of sites, we 
think that further analysis is required to clarify the effects of 
the sucker's payoff S, the topology of the connectivity network, and 
the dynamics (e.g., when irrational choices are forbidden) on the 
non-analytical behavior that appeared here at b = 1 and K = 0. 

The present investigations have expanded the research of 
coevolutionary game theory by applying Darwinian selection 
among a continuous set of noise levels in dynamical rules 
used by the individuals while other relevant ingredients of the 
model were fixed. In most of the previous studies of the co- 
evolutionary games only two ingredients of the system were 
allowed to evolve simultaneously. Recently, Van Segbroeck 
et al. have studied a model where the players could modify 
three quantities: their strategy, their connections, and the way 
a new partner is chosen [24]. In the light of the latter 
model one can ask what happens if the payoff parameter b 
is also considered as an individual property (b_x) that can 
be adopted from the neighbors, as well as the strategy s_x and 
dynamical rule K_x. Preliminary MC results have indicated 
that within the strategy coexistence region of the b-K 
plane the system evolves toward K*(b) while favoring smaller 
b values. It is found that the system evolves fast toward a 
state where players use the game with the lower b value, and 
subsequently the homogenization in K_x takes place as described 
above. This means that the Darwinian selection prefers the SH to 
the PD game, that is, the system develops an environment where 
the mutual cooperation (providing the maximum average payoff) 
can be achieved more conveniently. Similar results were 
reported by Worden and Levin [32] and also by Fort IU5I1 who 
studied models with different adoptions of payoff parameters. 
Evidently, other results can be obtained if the simultaneous 
evolution of the connectivity structure (interaction and 
learning networks [33]) is also possible. 

In principle, all the ingredients of the multi-agent coevolu- 
tionary games can be the subject of Darwinian selection if we 
assume that these features are determined by the participants. 
In that case the system can evolve towards a strategy distri- 
bution with a proper connectivity structure, payoff parame- 
ter, adoption rule(s), mutation, etc., that are preferred by the 
Darwinian selection. As a consequence, the given Darwinian 
selection will show us the preferred features (or parameter 
values) we can fix when exploring the effect of other properties 
[34]. To be more precise, the K* value(s) can be suggested for 
numerical simulations if one wishes to fix the noise level. 

Finally, we would like to mention that in many real systems, 
besides the evolving individual features, there are external 
conditions affecting the system behavior. For example, the 
noise itself can arise from external sources, as was 
investigated by Traulsen [35] and Perc [36]. Further systematic 
investigations are required to clarify the effect of the external 
noise and the other questions mentioned above. 